TREC Dynamic Domain

نویسندگان

  • Annie Bryant Burgess
  • Chris Mattmann
  • Giuseppe Totaro
  • Lewis John McGibbney
  • Paul Ramirez
چکیده

This paper outlines the creation of the Polar dataset within the TREC-Dynamic Domain track. The techniques used to create the Polar dataset fall into two basic categories: information extraction using Apache Tika and information retrieval using Apache Nutch. Frist, we expanded the parsing capabilities of Apache Tika, an open source framework for text and metadata extraction, to provide more searchable content within Polar data repositories. Second, we used Apache Nutch, a distributed search engine that runs on top of Apache Hadoop, to crawl three prominent Polar data repositories: the National Science Foundation Advanced Cooperative Arctic Data and Information System (ACADIS), the National Snow and Ice Data Center (NSIDC) Arctic Data Explorer (ADE), and the National Aeronautics and Space Administration Antarctic Master Directory (AMD). Because finding data is often a primary challenge in scientific discovery, the inclusion of the Polar dataset in TREC-DD helps advance science through data discovery and provides TREC-DD a new challenge in in the realm of search relevancy.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

UFMG at the TREC 2016 Dynamic Domain track

In TREC 2016, we focus on tackling the challenges posed by the Dynamic Domain (DD) track. The goal of the TREC DD track is to support research in dynamic, exploratory search within a complex domain. To this end, our participation investigates the suitability of multiple diversification approaches for dynamic information retrieval. In particular, based on fine-grained real-time feedback obtained...

متن کامل

University of Glasgow at TREC 2015: Experiments with Terrier in Contextual Suggestion, Temporal Summarisation and Dynamic Domain Tracks

In TREC 2015, we focus on tackling the challenges posed by the Contextual Suggestion, Temporal Summarisation and Dynamic Domain tracks. For Contextual Suggestion, we investigate the use of user-generated data in location-based social networks (LBSN) to suggest venues. For Temporal Summarisation, we examine features for event summarisation that explicitly model the entities involved in the event...

متن کامل

RMIT @ TREC 2016 Dynamic Domain Track: Exploiting Passage Representation for Retrieval and Relevance Feedback

The TREC Dynamic Domain search task addresses search scenarios where users engage interactively with search systems to tackle domain specific information needs. In our participation, we focused on utilizing passage-based representations in document retrieval and user feedback processing. In addition, we submitted a baseline retrieval method and a manual run that considers only relevant document...

متن کامل

TREC 2015 Dynamic Domain Track Overview

Search tasks for professional searchers, such as law enforcement agen-cies, police officers, and patent examiners, are often more complex thanopen domain Web search tasks. When professional searchers look for rele-vant information, it is often the case that they need to go through multipleiterations of searches to interact with a system. The Dynamic DomainTrack supports rese...

متن کامل

Laval University at TREC Dynamic Domain 2016: Subtopic extraction focused on Named Entities

— This paper describes the results submitted by Laval University to the TREC 2016 Dynamic Domain track. We submitted five runs. For this year we decided to focus around Named Entities to interpret the subtopics. Named Entities are one of the two types of queries we identified in this challenge, the other being queries about concepts. We describe in this paper our experiments to determine if tar...

متن کامل

TREC 2016 Dynamic Domain Track Overview

The main goal of TREC dynamic domain track is to explore and evaluate systems for interactive information retrieval, which reflects real-world search scenarios. Due to the importance of learning from user interactions, this track has been held for the second year. The name of the track contains two parts. “Dynamic” means that the search system dynamically adapts the provided ranking to the user...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015